//wave 1 dataset
drop if a_ethn_dv <= 4                                                          //Remove all non-ethnic minority participants from dataset
(40,436 observations deleted)
drop if w1attack == -8.0 & w1insult == -8.0 & w1avoid ==-8.0 & w1unsafe ==-8.0
(2,970 observations deleted)                                                    //Drop observations who have N/A for all four of the unsafe, avoid, insult and attack questions
drop if w1attack == -7.0 & w1insult == -7.0 & w1avoid ==-7.0 & w1unsafe ==-7.0
(943 observations deleted)                                                      //Drop observations who have proxy interview for these questions
drop if w1attack == -2.0 & w1insult == -2.0 & w1avoid ==-2.0 & w1unsafe ==-2.0
(7 observations deleted)                                                        //Drop observations who have refused to answer these questions
drop if w1attack == -1.0 & w1insult == -1.0 & w1avoid ==-1.0 & w1unsafe ==-1.0
(9 observations deleted)                                                        //Drop observations who have responded don't know for these questions
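
//Optional check (a sketch, not part of the original run): before running the four drops above,
//count how many observations carry the same missing code on all four items.
//The local macro name "code" is a hypothetical choice made for this illustration.
foreach code of numlist -8 -7 -2 -1 {
    count if w1attack == `code' & w1insult == `code' & w1avoid == `code' & w1unsafe == `code'
}
//Each count should equal the number of observations the matching drop command removes.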

//Wave 2 dataset
drop if w2ethnicity_dv <= 4                                //Remove all non-ethnic minority participants from dataset

//Combined wave 1 and 2 dataset 
merge 1:1 cwID using "C:\Users\evieg\OneDrive\Documents\Understranding society analysis\Stata datasets\Data cleaning - new\Wave 2 EM only.dta"

    Result                      Number of obs
    -----------------------------------------
    Not matched                         6,251
        from master                     2,115  (_merge==1)
        from using                      4,136  (_merge==2)

    Matched                             4,514  (_merge==3)                      //Merge wave 1 and wave 2 datasets (ethnic minorities only)
    -----------------------------------------

. drop if _merge == 2 | _merge == 1                                             //Drop observations who participated in wave 1 or wave 2 only 
(6,251 observations deleted)
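
//A possible shorthand for the merge-and-drop step above (a sketch, not the commands that were run).
//The local macro "wave2file" is a hypothetical stand-in for the wave 2 file named in the merge command,
//and the example assumes that file sits in the current working directory.
local wave2file "Wave 2 EM only.dta"
merge 1:1 cwID using "`wave2file'", keep(match) nogenerate    //Keep only observations present in both waves; no _merge variable is created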
. drop if w2everdrank == 2                                                      //Remove pp's who said 'no' to ever drinking
(697 observations deleted)

. drop if w2everdrank == -8.0                                                   //Remove pp's who have N/A for ever drinking
(951 observations deleted)

. drop if w2everdrank == -7.0                                                   //Remove pp's who had proxy interview for ever drinking
(153 observations deleted)


//Accounting for complex survey design
svyset w1psu [pweight = b_ind5mus_lw], strata(w1strata) singleunit(centered)   //Inform Stata of survey design 

Sampling weights: b_ind5mus_lw
             VCE: linearized
     Single unit: centered
        Strata 1: w1strata
 Sampling unit 1: w1psu
           FPC 1: <zero>
		   
svy: mean w1age_dv1                                                             //Example of population size issue  
(running mean on estimation sample)

Survey: Mean estimation

Number of strata = 465            Number of obs   =        956
Number of PSUs   = 828            Population size = 123.329484
                                  Design df       =        363

--------------------------------------------------------------
             |             Linearized
             |       Mean   std. err.     [95% conf. interval]
-------------+------------------------------------------------
   w1age_dv1 |    36.9141   .6628386      35.61062    38.21759
--------------------------------------------------------------
Note: Strata with single sampling unit centered at overall mean.
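
//A sketch for investigating the population size reported above (not part of the original run).
//Stata reports the sum of the sampling weights over the estimation sample as "Population size",
//so summarising the weight directly after svy: mean should show where the small figure comes from.
//Assumption: these lines are run immediately after the svy: mean command, so that e(sample) is still set.
summarize b_ind5mus_lw if e(sample)                                             //Mean, min and max of the weight in the estimation sample
display "Sum of weights: " r(sum)                                               //Should match the reported population size
count if b_ind5mus_lw == 0 & e(sample)                                          //Number of estimation-sample observations with a zero weight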